Abstract:
To address the efficiency and implementation of in-network caching in Named Data Networking, an in-network cache method based on the programmable data plane is proposed. A cache port is set up on the programmable data plane; when the cache policy determines that a content item should be cached, the output port is set to the cache port and the item is forwarded to the cache unit. Requested contents are classified according to their differing Quality of Service requirements, and the cache probability of each class is calculated from the content popularity and the cache cooperation information exchanged between nodes. A cache threshold is set dynamically according to cache space occupancy and compared with the cache probability to decide whether to cache the content. Experimental results show that the proposed cache placement strategy effectively improves the cache hit ratio and reduces the average transmission latency compared with traditional cache placement strategies.
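A minimal sketch of the placement decision described in this abstract, assuming hypothetical inputs and weights (content popularity, a QoS class weight, a flag indicating a neighbor already caches the item, and cache occupancy); the formulas below are illustrative, not the paper's.

# Hypothetical sketch of the placement decision; names and formulas are
# illustrative assumptions, not the paper's definitions.

def cache_probability(popularity: float, neighbor_cached: bool, qos_class_weight: float) -> float:
    """Combine popularity, QoS class, and node-cooperation info into a probability."""
    p = qos_class_weight * popularity
    if neighbor_cached:          # a cooperating node already holds the content
        p *= 0.5                 # lower the probability to avoid redundant copies
    return min(p, 1.0)

def dynamic_threshold(occupancy: float) -> float:
    """Raise the admission bar as the cache fills (occupancy in [0, 1])."""
    return 0.2 + 0.6 * occupancy

def should_cache(popularity, neighbor_cached, qos_class_weight, occupancy) -> bool:
    """True means: set the output port to the cache port and forward to the cache unit."""
    return cache_probability(popularity, neighbor_cached, qos_class_weight) > dynamic_threshold(occupancy)

# Example: popular content, not cached nearby, half-full cache -> cache it.
print(should_cache(popularity=0.9, neighbor_cached=False, qos_class_weight=1.0, occupancy=0.5))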
Abstract:
Caches use data very differently than main memory does, so DRAM caches can have dramatically different refresh requirements. Making canonical assumptions about retention times in DRAM can be drastic overkill within the cache context. Using standard refresh rates may be unnecessary and can be a significant waste of cache utilization and power. In this article, we view "retention time" in a new way by using statistical populations more appropriate for caches, and we suggest uses of a cache's inherent error-control mechanisms to reduce refresh rates by several orders of magnitude.
Abstract:
Successfully integrating cloud storage as a primary storage layer in the I/O stack is highly challenging. This is essentially due to two inherent critical issues: the high and variable cloud I/O latency and the per-I/O pricing model of cloud storage. To minimize the latency and monetary cost associated with cloud I/Os, caching is a crucial technology, as it directly influences how frequently the client has to communicate with the cloud. Unfortunately, current cloud caching schemes are mostly designed to optimize miss reduction as the sole objective and only focus on improving system performance, while ignoring the fact that different cache misses can have completely distinct effects in terms of latency and monetary cost. In this article, we present a cost-aware caching scheme, called GDS-LC, which is highly optimized for cloud storage caching. Different from traditional caching schemes that merely focus on improving cache hit ratios and the classic cost-aware schemes that can only achieve a single optimization target, GDS-LC offers a comprehensive cache design by considering not only the access locality but also the object size, associated latency, and price, aiming at enhancing the user experience with cloud storage in two aspects: access latency and monetary cost. To achieve this, GDS-LC virtually partitions the cache space into two regions: a high-priority latency-aware region and a low-priority price-aware region. Each region is managed by a cost-aware caching scheme, which is based on GreedyDual-Size (GDS) and designed for the cloud storage scenario by adopting clean-dirty differentiation and latency normalization. The GDS-LC framework is highly flexible, and we present a further enhanced algorithm, called GDS-LCF, by incorporating access frequency in caching decisions. We have built a prototype to emulate a typical cloud client cache and evaluate GDS-LC and GDS-LCF with Amazon Simple Storage Service (S3) in three different scenarios: local cloud, Internet cloud, and heterogeneous cloud. Our experimental results show that our caching schemes can effectively achieve both optimization goals: low access latency and low monetary cost. It is our hope that this work can inspire the community to reconsider the cache design in the cloud environment, especially for the purpose of integrating cloud storage into the current storage stack as a primary layer.
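As background, a minimal sketch of the GreedyDual-Size (GDS) rule that GDS-LC builds on, where each object gets priority H = L + cost/size and L is a global value inflated to the priority of the last victim; the clean-dirty differentiation, latency normalization, and two-region split of GDS-LC are not reproduced here, and the cost input is an assumed stand-in for either a normalized latency or a per-request price.

import heapq

class GDSCache:
    """Minimal GreedyDual-Size sketch: priority H = L + cost / size.

    'cost' can be a normalized fetch latency or a per-request price; GDS-LC
    manages two such regions (latency-aware and price-aware), not shown here.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.L = 0.0          # global inflation value
        self.entries = {}     # key -> (priority, size)
        self.heap = []        # (priority, key); stale entries are skipped lazily

    def _evict_one(self):
        while self.heap:
            prio, key = heapq.heappop(self.heap)
            if key in self.entries and self.entries[key][0] == prio:
                self.L = prio                     # inflate priorities of future insertions
                self.used -= self.entries[key][1]
                del self.entries[key]
                return

    def access(self, key: str, size: int, cost: float):
        """Insert or refresh `key`; objects with a low cost/size ratio decay out first."""
        if key not in self.entries:
            while self.used + size > self.capacity and self.entries:
                self._evict_one()
            self.used += size
        prio = self.L + cost / size
        self.entries[key] = (prio, size)
        heapq.heappush(self.heap, (prio, key))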
Abstract:
Solid State Drives (SSDs) have been extensively deployed as the cache of hard disk-based storage systems. An SSD-based cache generally supplies ultra-large capacity, but managing such a large cache introduces excessive memory overhead, which in turn makes the SSD-based cache neither cost-effective nor energy-efficient. This work aims to reduce the memory overhead introduced by the replacement policy of an SSD-based cache. Traditionally, the data structures involved in the cache replacement policy reside in main memory. However, these in-memory data structures are no longer suitable for an SSD-based cache, since the cache is much larger than ever. We propose a memory-efficient framework that keeps most data structures on the SSD while leaving only a memory-efficient data structure (i.e., a new Bloom filter proposed in this work) in main memory. Our framework can be used to implement any LRU-based replacement policy with negligible memory overhead. We evaluate our proposals via theoretical analysis and a prototype implementation. Experimental results demonstrate that our framework is practical for implementing most replacement policies for large caches and is able to reduce the memory overhead by about .
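A minimal sketch of the general idea of keeping only a compact membership filter in DRAM while the full replacement metadata resides on the SSD; the standard Bloom filter below and the ssd_metadata_lookup callback are illustrative stand-ins, not the paper's new Bloom-filter variant.

import hashlib

class BloomFilter:
    """Standard Bloom filter: the in-DRAM structure is compact, and a false
    positive only costs an extra SSD metadata lookup, never a correctness error."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def cache_lookup(key: str, bloom: BloomFilter, ssd_metadata_lookup) -> bool:
    """Consult the in-memory filter first; only touch SSD-resident metadata on a maybe."""
    if not bloom.might_contain(key):
        return False                      # definitely not cached, no SSD access needed
    return ssd_metadata_lookup(key)       # hypothetical callback: reads metadata from SSD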
Abstract:
A computer has three logical systems: the central processing unit (CPU), which processes data according to instructions; the memory and storage system, which stores instructions and data (it can be volatile temporary memory or a permanent storage system such as hard disks); and the input/output system, which is responsible for communication between the computer and external entities. The memory and storage system is the primary concern of this article, specifically cache memory, which stores frequently accessed instructions and data for the CPU. The word cache comes from the French "cacher," meaning "to hide," a reference to early cache implementations in which the cache was invisible to the user and the CPU. Cache memory is important: it bridges the gap in capabilities between the CPU and main memory, as CPUs have been rapidly getting faster. Cache memory has advanced considerably since Wilkes in 1965 proposed a two-level main store: one conventional and the other an unconventional "slave" memory. It has now become a conventional component of high-speed computing, with increasing sophistication and size. The performance of the cache is critical to overall system processing ability, and ongoing research in this area attempts to reduce as much as possible the speed gap between the CPU and the memory.
Abstract:
Modern key-value stores, object stores, Internet proxy caches, and Content Delivery Networks (CDN) often manage objects of diverse sizes, e.g., blobs, video files of different lengths, images with varying resolutions, and small documents. In such workloads, size-aware cache policies outperform size-oblivious algorithms. Unfortunately, existing size-aware algorithms tend to be overly complicated and computationally expensive. Our work follows a more approachable pattern; we extend the prevalent (size-oblivious) TinyLFU cache admission policy to handle variable-sized items. Implementing our approach inside two popular caching libraries only requires minor changes. We show that our algorithms yield competitive or better hit-ratios and byte hit-ratios compared to the state-of-the-art size-aware algorithms such as AdaptSize, LHD, LRB, and GDSF. Further, a runtime comparison indicates that our implementation is faster by up to 3x compared to the best alternative, i.e., it imposes a much lower CPU overhead.
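A minimal sketch of one way size-aware admission can sit on top of a TinyLFU-style frequency sketch: the candidate is admitted only if its estimated frequency exceeds the combined frequency of the victims it would displace. The count-min parameters and this exact rule are illustrative assumptions, not the algorithms evaluated in the paper.

import random

class CountMinSketch:
    """Tiny count-min sketch used as a TinyLFU-style frequency estimator."""

    def __init__(self, width: int = 1024, depth: int = 4, seed: int = 42):
        rng = random.Random(seed)
        self.width = width
        self.seeds = [rng.randrange(1 << 30) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, key: str, seed: int) -> int:
        return hash((seed, key)) % self.width

    def record(self, key: str):
        for row, seed in zip(self.table, self.seeds):
            row[self._index(key, seed)] += 1

    def estimate(self, key: str) -> int:
        return min(row[self._index(key, seed)] for row, seed in zip(self.table, self.seeds))


def admit(candidate: str, victims: list, sketch: CountMinSketch) -> bool:
    """Size-aware admission: a large candidate must beat the combined frequency
    of all the (possibly multiple) victims evicted to make room for it."""
    victim_freq = sum(sketch.estimate(v) for v in victims)
    return sketch.estimate(candidate) > victim_freq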
Abstract:
Query result caching is a crucial technique employed in search engines, reducing their response time and load. As search engines continuously update their indexes, the query results held in long-lived cache entries may become stale, so it is important to provide a refresh mechanism that enhances the freshness of cached results. We present a pre-judgment approach to improve the freshness of the result cache and design an incomplete allocation algorithm. We introduce query time-to-live (query-TTL) and term-TTL structures to pre-judge the result cache: the query-TTL is used to pre-check the likelihood of a cache hit, and the term-TTL is applied to maintain, for all terms, the state of the latest posting lists. For the cache structure, we design a Queue-Hash structure and develop the corresponding incomplete allocation algorithm. Preliminary results demonstrate that our approaches can improve the freshness of cached results and decrease processing overhead compared with approaches that use no pre-judgment.
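A minimal sketch of the pre-judgment idea with hypothetical data structures: a cached entry is served only while its query-TTL has not expired and none of its terms has had a posting-list update since the entry was cached. The class and method names are illustrative, not the paper's Queue-Hash design.

import time

class ResultCache:
    """Hypothetical sketch of query-TTL / term-TTL pre-judgment for a result cache."""

    def __init__(self, query_ttl_seconds: float):
        self.query_ttl = query_ttl_seconds
        self.results = {}          # query -> (results, cached_at)
        self.term_updated_at = {}  # term -> time of latest posting-list update

    def index_update(self, term: str):
        """Record that the posting list of `term` changed."""
        self.term_updated_at[term] = time.time()

    def put(self, query: str, results):
        self.results[query] = (results, time.time())

    def get(self, query: str):
        entry = self.results.get(query)
        if entry is None:
            return None
        results, cached_at = entry
        if time.time() - cached_at > self.query_ttl:
            return None                                   # query-TTL expired: likely stale
        for term in query.split():
            if self.term_updated_at.get(term, 0.0) > cached_at:
                return None                               # a term's posting list changed since caching
        return results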
Abstract:
In-memory computing utilizes RAM as a storage or a database in computer systems that require very fast response times. RAM-based SSDs and RAM-SSD fusion devices are block devices that use RAM as the main storage space. The page cache in Linux causes performance degradation for these RAM-based block devices. Direct I/O is a means of avoiding the page cache but has some restrictions, such as block alignment. This paper proposes a byte direct I/O that allows all I/O requests to bypass the page cache without modifying application programs that use buffered I/O. The byte direct I/O has been implemented in the Linux kernel, and its performance has been evaluated with existing benchmark programs.
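For context on the block-alignment restriction mentioned above, a minimal Linux-only sketch of a conventional O_DIRECT read, where the offset, length, and buffer must all be aligned (a page-aligned buffer is obtained via an anonymous mmap). This illustrates the restriction that byte direct I/O is meant to lift; it is not the paper's mechanism, and the file path is a placeholder on a filesystem that supports O_DIRECT.

import mmap
import os

BLOCK = 4096  # typical logical block / page size; alignment unit assumed here

# O_DIRECT bypasses the page cache, but the offset, length, and buffer address
# must all be block-aligned, which is the restriction discussed in the abstract.
fd = os.open("/tmp/testfile", os.O_RDONLY | os.O_DIRECT)
try:
    buf = mmap.mmap(-1, BLOCK)           # anonymous mapping: page-aligned buffer
    nread = os.preadv(fd, [buf], 0)      # offset 0 and length BLOCK are both aligned
    data = buf[:nread]
finally:
    os.close(fd)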
Abstract:
Direct-mapped caches are defined, and it is shown that trends toward larger cache sizes and faster hit times favor their use. The arguments are restricted initially to single-level caches in uniprocessors and are then extended to two-level cache hierarchies. How and when these arguments for caches in uniprocessors apply to caches in multiprocessors is also discussed.
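As a concrete illustration of the definition, the sketch below computes the set index and tag for a direct-mapped cache with illustrative sizes; every address maps to exactly one cache line, so no associative search is needed on a lookup.

CACHE_SIZE = 32 * 1024                 # 32 KiB cache (illustrative)
BLOCK_SIZE = 64                        # 64-byte cache lines
NUM_LINES = CACHE_SIZE // BLOCK_SIZE   # direct-mapped: one line per set

def direct_mapped_lookup(address: int):
    """Return the (index, tag) a physical address maps to in a direct-mapped cache."""
    block_number = address // BLOCK_SIZE
    index = block_number % NUM_LINES    # exactly one candidate line per address
    tag = block_number // NUM_LINES     # stored alongside the line to detect hits
    return index, tag

# Two addresses 32 KiB apart collide on the same line despite different tags:
print(direct_mapped_lookup(0x0000))    # (0, 0)
print(direct_mapped_lookup(0x8000))    # (0, 1) -> same index, different tag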
Abstract:
This article describes the algorithm, implementation, and deployment experience of CacheSack, the admission algorithm for Google datacenter flash caches. CacheSack minimizes the dominant costs of Google's datacenter flash caches: disk IO and flash footprint. CacheSack partitions cache traffic into disjoint categories, analyzes the observed cache benefit of each subset, and formulates a knapsack problem to assign the optimal admission policy to each subset. Prior to this work, Google datacenter flash cache admission policies were optimized manually, with most caches using the Lazy Adaptive Replacement Cache algorithm. Production experiments showed that CacheSack significantly outperforms the prior static admission policies for a 7.7% improvement of the total cost of ownership, as well as significant improvements in disk reads (9.5% reduction) and flash wearout (17.8% reduction).
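A rough sketch of the general shape of such an optimization, not Google's actual formulation: each traffic category has candidate admission policies with an estimated disk-IO saving and flash-footprint cost, and a greedy, knapsack-style pass assigns one policy per category under a flash budget. The policy names, costs, and savings are invented for illustration.

def assign_policies(categories, flash_budget):
    """Greedy, knapsack-style sketch (illustrative, not CacheSack itself).

    `categories` maps a category name to a list of candidate policies, each
    given as (policy_name, flash_cost, disk_io_saving). One policy is picked
    per category, preferring high saving per byte of flash, with "bypass"
    (zero cost, zero saving) as the fallback.
    """
    options = []
    for category, candidates in categories.items():
        for policy, cost, saving in candidates:
            if cost > 0:
                options.append((saving / cost, category, policy, cost))
    options.sort(reverse=True)                  # best saving per unit of flash first

    assignment = {category: "bypass" for category in categories}
    remaining = flash_budget
    for ratio, category, policy, cost in options:
        if assignment[category] == "bypass" and cost <= remaining:
            assignment[category] = policy
            remaining -= cost
    return assignment

# Illustrative use: two traffic categories, a small flash budget.
print(assign_policies(
    {"cat_a": [("admit_all", 100, 500), ("admit_on_second_miss", 40, 350)],
     "cat_b": [("admit_all", 80, 200)]},
    flash_budget=120,
))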